80 research outputs found

    Large Social Networks can be Targeted for Viral Marketing with Small Seed Sets

    Full text link
    In a "tipping" model, each node in a social network, representing an individual, adopts a behavior if a certain number of his incoming neighbors previously held that property. A key problem for viral marketers is to determine an initial "seed" set in a network such that if given a property then the entire network adopts the behavior. Here we introduce a method for quickly finding seed sets that scales to very large networks. Our approach finds a set of nodes that guarantees spreading to the entire network under the tipping model. After experimentally evaluating 31 real-world networks, we found that our approach often finds such sets that are several orders of magnitude smaller than the population size. Our approach also scales well - on a Friendster social network consisting of 5.6 million nodes and 28 million edges we found a seed sets in under 3.6 hours. We also find that highly clustered local neighborhoods and dense network-wide community structure together suppress the ability of a trend to spread under the tipping model

    Spatio-Temporal Reasoning About Agent Behavior

    Get PDF
    There are many applications where we wish to reason about spatio-temporal aspects of an agent's behavior. This dissertation examines several facets of this type of reasoning. First, given a model of past agent behavior, we wish to reason about the probability that an agent takes a given action at a certain time. Previous work combining temporal and probabilistic reasoning has made either independence or Markov assumptions. This work introduces Annotated Probabilistic Temporal (APT) logic which makes neither assumption. Statements in APT logic consist of rules of the form "Formula G becomes true with a probability [L,U] within T time units after formula F becomes true'' and can be written by experts or extracted automatically. We explore the problem of entailment - finding the probability that an agent performs a given action at a certain time based on such a model. We study this problem's complexity and develop a sound, but incomplete fixpoint operator as a heuristic - implementing it and testing it on automatically generated models from several datasets. Second, agent behavior often results in "observations'' at geospatial locations that imply the existence of other, unobserved, locations we wish to find ("partners"). In this dissertation, we formalize this notion with "geospatial abduction problems" (GAPs). GAPs try to infer a set of partner locations for a set of observations and a model representing the relationship between observations and partners for a given agent. This dissertation presents exact and approximate algorithms for solving GAPs as well as an implemented software package for addressing these problems called SCARE (the Spatio-Cultural Abductive Reasoning Engine). We tested SCARE on counter-insurgency data from Iraq and obtained good results. We then provide an adversarial extension to GAPs as follows: given a fixed set of observations, if an adversary has probabilistic knowledge of how an agent were to find a corresponding set of partners, he would place the partners in locations that minimize the expected number of partners found by the agent. We examine this problem, along with its complement by studying their computational complexity, developing algorithms, and implementing approximation schemes. We also introduce a class of problems called geospatial optimization problems (GOPs). Here the agent has a set of actions that modify attributes of a geospatial region and he wishes to select a limited number of such actions (with respect to some budget and other constraints) in a manner that maximizes a benefit function. We study the complexity of this problem and develop exact methods. We then develop an approximation algorithm with a guarantee. For some real-world applications, such as epidemiology, there is an underlying diffusion process that also affects geospatial proprieties. We address this with social network optimization problems (SNOPs) where given a weighted, labeled, directed graph we seek to find a set of vertices, that if given some initial property, optimize an aggregate study with respect to such diffusion. We develop and implement a heuristic that obtains a guarantee for a large class of such problems

    Strongly Hierarchical Factorization Machines and ANOVA Kernel Regression

    Full text link
    High-order parametric models that include terms for feature interactions are applied to various data mining tasks, where ground truth depends on interactions of features. However, with sparse data, the high- dimensional parameters for feature interactions often face three issues: expensive computation, difficulty in parameter estimation and lack of structure. Previous work has proposed approaches which can partially re- solve the three issues. In particular, models with factorized parameters (e.g. Factorization Machines) and sparse learning algorithms (e.g. FTRL-Proximal) can tackle the first two issues but fail to address the third. Regarding to unstructured parameters, constraints or complicated regularization terms are applied such that hierarchical structures can be imposed. However, these methods make the optimization problem more challenging. In this work, we propose Strongly Hierarchical Factorization Machines and ANOVA kernel regression where all the three issues can be addressed without making the optimization problem more difficult. Experimental results show the proposed models significantly outperform the state-of-the-art in two data mining tasks: cold-start user response time prediction and stock volatility prediction.Comment: 9 pages, to appear in SDM'1
    • …
    corecore